CHARACTERIZATION OF WORST-CASE GMRES 
Abstract. Given a matrix $A$ and iteration step $k$, we study a best possible attainable upper
bound on the GMRES residual norm that does not depend on the initial vector $b$. This quantity
is called the worst-case GMRES approximation. We show that the worst-case behavior of GMRES
for the matrices $A$ and $A^T$ is the same, and we analyze properties of initial vectors for which the
worst-case residual norm is attained. In particular, we show that such vectors satisfy a certain "cross
equality", and we characterize them as right singular vectors of the corresponding GMRES residual
matrix. We show that the worst-case GMRES polynomial may not be uniquely determined, and
we consider the relation between the worst-case and the ideal GMRES approximations, giving new
examples in which the inequality between the two quantities is sharp at all iteration steps $k \ge 3$.
Finally, we give a complete characterization of how the values of the approximation problems in
the context of worst-case and ideal GMRES for a real matrix change when one considers complex
(rather than real) polynomials and initial vectors in these problems.
1. Introduction. Let a nonsingular matrix $A \in \mathbb{R}^{n\times n}$ and a vector $b \in \mathbb{R}^n$
be given. Consider solving the system of linear algebraic equations $Ax = b$ with
the initial guess $x_0 = 0$ using the GMRES method [11]. This method generates a
sequence of iterates $x_k \in \mathcal{K}_k(A,b) = \mathrm{span}\{b, Ab, \dots, A^{k-1}b\}$, $k = 1, 2, \dots$, so that the
corresponding $k$th residual $r_k = b - Ax_k$ satisfies

(1.1)   $\|r_k\| = \min_{p\in\pi_k} \|p(A)b\|.$

Here $\|\cdot\|$ denotes the Euclidean norm, and $\pi_k$ denotes the set of real polynomials
of degree at most $k$ and with value one at the origin. Note that for a real matrix $A$
and a real right hand side $b$ the minimum in (1.1) is achieved for a real polynomial.
Considering only real polynomials therefore does not represent any restriction.

It is clear from (1.1) that the sequence of GMRES residual norms $\|r_k\|$, $k = 1, 2, \dots$,
is nonincreasing. It terminates with $r_k = 0$ if and only if $k$ is equal to
$d(A,b)$, the degree of the minimal polynomial of the vector $b$ with respect to $A$. For
each $b$ we have $d(A,b) \le d(A)$, the degree of the minimal polynomial of $A$.

A geometric characterization of the iterate $x_k \in \mathcal{K}_k(A,b)$, which is mathematically
equivalent to (1.1), is given by

(1.2)   $r_k \perp A\mathcal{K}_k(A,b).$

To emphasize the dependence of the $k$th GMRES residual $r_k$ on the given data $A$, $b$
and $k$ we will sometimes write

$r_k = \mathrm{GMRES}(A, b, k)$   or   $r_k = p_k(A)b,$

where $p_k \in \pi_k$ is the $k$th GMRES polynomial of $A$ and $b$, i.e., the polynomial that
solves the minimization problem on the right hand side of (1.1). As long as $r_k \ne 0$,
this polynomial is uniquely determined. The matrix $p_k(A)$ is called the $k$th GMRES
residual matrix of $A$ and $b$. For further basic properties and algorithmic details of the
GMRES method we refer to the original paper [11] or the books [2], [8], [10].
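The minimization (1.1) and the orthogonality condition (1.2) can be illustrated with a small dense computation. The following is only a minimal sketch, assuming NumPy and a small well-conditioned test matrix; the helper name `gmres_residual` is hypothetical, and the explicit Krylov matrix is used for clarity rather than numerical robustness (practical GMRES implementations use the Arnoldi process instead).

```python
import numpy as np

def gmres_residual(A, b, k):
    """Return r_k = b - A x_k with x_k in K_k(A, b), via dense least squares.

    Illustrative only: forms [Ab, A^2 b, ..., A^k b] explicitly, which is
    numerically poor for larger k.
    """
    cols, w = [], b.copy()
    for _ in range(k):
        w = A @ w
        cols.append(w)
    K = np.column_stack(cols)                  # columns span A K_k(A, b)
    c, *_ = np.linalg.lstsq(K, b, rcond=None)  # p_k(z) = 1 - sum_j c_j z^j
    return b - K @ c

rng = np.random.default_rng(0)
n, k = 6, 3
A = rng.standard_normal((n, n)) + 3.0 * np.eye(n)  # assumed test matrix
b = rng.standard_normal(n)
b /= np.linalg.norm(b)

r = gmres_residual(A, b, k)
# Condition (1.2): r_k is orthogonal to A^j b for j = 1, ..., k.
orth = [abs(r @ np.linalg.matrix_power(A, j) @ b) for j in range(1, k + 1)]
# Residual norms for k = 1, ..., n form a nonincreasing sequence.
norms = [np.linalg.norm(gmres_residual(A, b, j)) for j in range(1, n + 1)]
```

Here `orth` should contain values at rounding-error level, and `norms` should be nonincreasing, in line with the remark following (1.1).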

In the following we will assume without loss of generality that $\|b\| = 1$. A common
approach for investigating the GMRES convergence behavior is to bound (1.1)
independently of $b$. For each iteration step $k$ the best possible bound on the GMRES
residual norm that is independent of $b$ is given by maximizing the right hand side of
(1.1) over all unit norm vectors, i.e.,

(1.3)   $\|r_k\| = \min_{p\in\pi_k} \|p(A)b\| \le \max_{\|v\|=1} \min_{p\in\pi_k} \|p(A)v\| =: \Psi_k(A).$

The quantity $\Psi_k(A)$ is called the $k$th worst-case GMRES approximation. It is easy
to see that the bound (1.3) is sharp in the sense that for each given $A$ and $k$ there
exists a unit norm vector $b$ so that the corresponding $k$th GMRES residual vector
satisfies $\|r_k\| = \Psi_k(A)$. We will call such a vector $b$, the corresponding $k$th GMRES
polynomial $p_k$ and the corresponding $k$th GMRES residual matrix $p_k(A)$ the $k$th
worst-case GMRES initial vector, polynomial and residual matrix, respectively. If $A$
is singular, then $\Psi_k(A) = 1$ for all $k \ge 0$ (to see this, simply take $b$ as a unit norm
vector in the kernel of $A$). Hence only the case of a nonsingular matrix $A$ is of interest
in this context. For such $A$ we have

$1 \ge \Psi_1(A) \ge \cdots \ge \Psi_{d(A)-1}(A) > \Psi_{d(A)}(A) = 0,$

and therefore we only need to consider $1 \le k \le d(A) - 1$.

It is known that $\Psi_k(A)$ for a fixed $k$ is a continuous function on the open set of
nonsingular matrices; see Theorem 3.1] or [1, Theorem 2.5]. Moreover, it was shown
in [H Theorem 2.7] that $\Psi_k(A) = 1$ for a nonsingular matrix $A$ if and only if zero
is contained in some generalized field of values derived from the powers $I, A, \dots, A^k$.
Most of the other previously published results on worst-case GMRES are devoted to
studying the tightness of the inequality

(1.4)   $\Psi_k(A) \le \min_{p\in\pi_k} \|p(A)\| =: \varphi_k(A),$

which is easily derived from (1.3) using the submultiplicativity property of the Euclidean
norm. The quantity $\varphi_k(A)$ is called the $k$th ideal GMRES approximation.
The polynomial for which the minimum is attained in (1.4) is called the $k$th ideal
GMRES polynomial of $A$. This polynomial is uniquely determined; see [H [5]. It
was shown that (1.4) is an equality for normal matrices $A$ and all $k > 0$, and for
$k = 1$ and any nonsingular $A$ [3], [6]. Some nonnormal matrices $A$ are known for which
$\Psi_k(A) < \varphi_k(A)$, even $\Psi_k(A) \ll \varphi_k(A)$, for certain $k$; see [1], [13].
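The chain of inequalities behind (1.3) and (1.4) can be made concrete by sampling. The sketch below is an illustration under stated assumptions, not a method for computing either quantity: it assumes NumPy and a random test matrix, and the helper names `krylov`, `gmres_coeffs` and `residual_matrix` are hypothetical. Every sampled GMRES residual norm is a lower bound on $\Psi_k(A)$, and the spectral norm of every sampled GMRES residual matrix is an upper bound on $\varphi_k(A)$.

```python
import numpy as np

def krylov(A, v, k):
    """Matrix [Av, A^2 v, ..., A^k v]."""
    cols, w = [], v.copy()
    for _ in range(k):
        w = A @ w
        cols.append(w)
    return np.column_stack(cols)

def gmres_coeffs(A, v, k):
    """Coefficients c of the GMRES polynomial p(z) = 1 - sum_j c_j z^j."""
    c, *_ = np.linalg.lstsq(krylov(A, v, k), v, rcond=None)
    return c

def residual_matrix(A, c):
    """p(A) = I - sum_j c_j A^j for the coefficient vector c."""
    n = A.shape[0]
    P, Aj = np.eye(n), np.eye(n)
    for cj in c:
        Aj = Aj @ A
        P = P - cj * Aj
    return P

rng = np.random.default_rng(1)
n, k = 5, 2
A = rng.standard_normal((n, n)) + 3.0 * np.eye(n)  # assumed test matrix

lower = 0.0      # largest sampled GMRES residual norm, a lower bound on Psi_k(A)
upper = np.inf   # smallest sampled ||p(A)||, an upper bound on phi_k(A)
for _ in range(200):
    v = rng.standard_normal(n)
    v /= np.linalg.norm(v)
    P = residual_matrix(A, gmres_coeffs(A, v, k))
    lower = max(lower, np.linalg.norm(P @ v))
    upper = min(upper, np.linalg.norm(P, 2))
```

The sampled values satisfy `lower <= upper`; how close they are says nothing definite about whether (1.4) is an equality for the chosen matrix.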

The ideal GMRES approximation problem can be formulated as a semidefinite 
program (see [13]) and hence can be solved efficiently by standard software. On 
the other hand, we are unaware of any efficient algorithm for solving the worst-case 
GMRES approximation problem, so that in practice one needs to resort to a "general 
purpose" nonlinear solver to compute worst-case GMRES data. The difficult nonlinear 
nature of the worst-case GMRES approximation problem may be one of the reasons 
why this problem is less studied (both theoretically and numerically) than the ideal 
GMRES approximation problem. 
This paper is mainly devoted to characterizations of the worst-case GMRES problem
(1.3). We first show in Section 2 that $\Psi_k(A) = \Psi_k(A^T)$, and that worst-case initial
vectors satisfy a certain "cross equality". Next, in Section 3, we look at the worst-case
GMRES approximation problem from the optimization point of view and show that
$k$th worst-case GMRES initial vectors are always right singular vectors of the corresponding
$k$th GMRES residual matrix. In Section 4 we prove that a $k$th worst-case
GMRES polynomial may not be uniquely determined (unlike the $k$th ideal GMRES
polynomial), and we give a numerical example for two different polynomials and corresponding
initial vectors that both attain the same worst-case GMRES value at the
same step $k$. In Section 5 we further study differences between the worst-case and the
ideal GMRES approximations. In particular, we state a parameterized set of matrices
$A$ of arbitrary size $2n$ (with $n > 2$) for which the inequality in (1.4) is sharp for all
$k = 3, \dots, 2n - 1$. In the previously published examples in [1], [13], a small matrix
$A$ is constructed for which the sharp inequality occurs for exactly one $k$. Finally, in
Section 6 we analyze whether the values of the max-min approximation (1.3) and the
min-max approximation (1.4) for a real matrix change if we consider the maximization
over complex vectors and/or the minimization over complex polynomials. This
analysis gives another indication for the difference between the two approximation
problems.

2. The cross equality. In this section we generalize two results of Zavorin [15].
The first shows that $\Psi_k(A) = \Psi_k(A^T)$ and the second concerns a special property
of worst-case initial vectors (they satisfy the so-called "cross equality"). Note that
Zavorin proved these results only for diagonalizable matrices using quite a complicated
technique based on the decomposition of the corresponding Krylov matrix. Using a
simple algebraic technique we prove these results for general matrices. All results
presented in this section can easily be generalized from real to complex matrices.

Theorem 2.1. If $A \in \mathbb{R}^{n\times n}$ is a nonsingular matrix, then $\Psi_k(A) = \Psi_k(A^T)$ for
all $k = 1, \dots, d(A) - 1$.

Proof. Let $1 \le k \le d(A) - 1$ and consider any unit norm vector $b$ such that
the corresponding $k$th GMRES residual vector $r_k = p_k(A)b$ is nonzero. The defining
property (1.2) of $r_k$ means that $\langle A^j b, r_k\rangle = 0$ for $j = 1, \dots, k$. Hence, for any $q \in \pi_k$,

(2.1)   $\|r_k\|^2 = \langle p_k(A)b, r_k\rangle = \langle b, r_k\rangle = \langle q(A)b, r_k\rangle = \langle b, q(A^T)r_k\rangle \le \|q(A^T)r_k\|,$

where the last inequality follows from the Cauchy-Schwarz inequality and $\|b\| = 1$.

If $b$ is a unit norm $k$th worst-case GMRES initial vector and $r_k$ is the corresponding
$k$th GMRES residual vector, then the previous inequality means that

(2.2)   $\|r_k\|^2 = \Psi_k^2(A) \le \|q(A^T)r_k\|,$

where $q \in \pi_k$ is arbitrary. Dividing by $\|r_k\|$ and taking the minimum over all $q \in \pi_k$
we get

(2.3)   $\Psi_k(A) \le \min_{q\in\pi_k} \left\| q(A^T)\frac{r_k}{\|r_k\|} \right\| \le \Psi_k(A^T).$

Now we can reverse the roles of $A$ and $A^T$, and then repeat the whole argument to
obtain the opposite inequality, i.e., $\Psi_k(A^T) \le \Psi_k(A)$. □

The following theorem describes a special property of worst-case initial vectors:
If we apply GMRES to $A$ and a unit norm $k$th worst-case initial vector $b$ giving at
step $k$ the residual vector $r_k$, and then $k$ steps of GMRES to $A^T$ and the initial vector
$r_k/\|r_k\|$, we obtain again the original initial vector $b$ (up to a scaling factor).

Theorem 2.2. Let $A \in \mathbb{R}^{n\times n}$ be a nonsingular matrix, and let $1 \le k \le d(A) - 1$.
If $b \in \mathbb{R}^n$ is a unit norm $k$th worst-case GMRES initial vector and

$r_k = \mathrm{GMRES}(A, b, k), \qquad s_k = \mathrm{GMRES}\left(A^T, \frac{r_k}{\|r_k\|}, k\right),$

then

$\|s_k\| = \|r_k\| = \Psi_k(A) \qquad\text{and}\qquad b = \frac{s_k}{\Psi_k(A)}.$

Proof. Let $b$ be a unit norm $k$th worst-case GMRES initial vector and let $r_k =
\mathrm{GMRES}(A, b, k)$. In addition, let $s_k = \mathrm{GMRES}(A^T, r_k/\|r_k\|, k)$ and let $q_k$ be the
corresponding $k$th GMRES polynomial. Using this polynomial in (2.2) yields

$\|r_k\| = \Psi_k(A) \le \left\| q_k(A^T)\frac{r_k}{\|r_k\|} \right\| = \|s_k\| \le \Psi_k(A^T).$

However, as shown in Theorem 2.1, equality holds throughout, which shows the first
assertion.

Moreover, since $\|r_k\| = \|s_k\|$, the (Cauchy-Schwarz) inequality on the right of
(2.1) is an equality for the given $b$ and $q = q_k$, i.e.,

$\langle b, q_k(A^T)r_k\rangle = \|q_k(A^T)r_k\|.$

Since $\|b\| = 1$, this happens if and only if

$b = \frac{q_k(A^T)r_k}{\|q_k(A^T)r_k\|} = \frac{q_k(A^T)r_k}{\|r_k\|\,\|r_k\|} = \frac{s_k}{\|r_k\|},$

which finishes the proof. □

The previous theorem shows that if $b$ is a unit norm $k$th worst-case GMRES initial
vector, then (with the same notation as in the proof above)

$\Psi_k(A)\,b = s_k = q_k(A^T)\frac{r_k}{\|r_k\|} = q_k(A^T)\,p_k(A)\frac{b}{\|r_k\|},$

or, equivalently,

(2.4)   $q_k(A^T)\,p_k(A)\,b = \Psi_k^2(A)\,b.$

In other words, $b$ is an eigenvector of the matrix $q_k(A^T)p_k(A)$ with the corresponding
eigenvalue $\Psi_k^2(A)$. In Corollary 3.7 we will show that $q_k = p_k$, i.e., that $b$ is a right
singular vector of the $k$th worst-case GMRES residual matrix $p_k(A)$.

To further investigate vectors with the special property introduced in Theorem 2.2
we use the following definition.

Definition 2.3. Let $A \in \mathbb{R}^{n\times n}$ be nonsingular. We say that a unit norm vector
$b \in \mathbb{R}^n$ with $d(A,b) > k$ satisfies the cross equality for $A$ and the step $k \ge 1$, if

$b = \frac{s_k}{\|s_k\|}, \qquad\text{where}\quad s_k = \mathrm{GMRES}\left(A^T, \frac{r_k}{\|r_k\|}, k\right), \quad r_k = \mathrm{GMRES}(A, b, k).$

Algorithm 1 (Cross iterations 1)

  $b^{(0)} = b$
  for $j = 1, 2, \dots$ do
    $r_k^{(j)} = \mathrm{GMRES}(A, b^{(j-1)}, k)$
    $c^{(j-1)} = r_k^{(j)} / \|r_k^{(j)}\|$
    $s_k^{(j)} = \mathrm{GMRES}(A^T, c^{(j-1)}, k)$
    $b^{(j)} = s_k^{(j)} / \|s_k^{(j)}\|$
  end for


Inspired by Theorem 2.2 we define the iterative process shown in Algorithm 1.
To analyze this algorithm, let us denote

$r_k^{(j)} = p_k^{(j)}(A)\,b^{(j-1)} \qquad\text{and}\qquad s_k^{(j)} = q_k^{(j)}(A^T)\,c^{(j-1)}.$

Using $q = q_k^{(j)}$ in (2.1) we then get

$\|r_k^{(j)}\|^2 \le \|q_k^{(j)}(A^T)\,r_k^{(j)}\| = \|r_k^{(j)}\|\,\|q_k^{(j)}(A^T)\,c^{(j-1)}\| = \|r_k^{(j)}\|\,\|s_k^{(j)}\|.$

Now consider (2.1) with the roles of $A$ and $A^T$ reversed, i.e.,

$\|s_k^{(j)}\|^2 = \langle q(A^T)c^{(j-1)}, s_k^{(j)}\rangle = \langle c^{(j-1)}, q(A)s_k^{(j)}\rangle = \|s_k^{(j)}\|\,\langle c^{(j-1)}, q(A)b^{(j)}\rangle \le \|s_k^{(j)}\|\,\|q(A)b^{(j)}\|$

for all $q \in \pi_k$. We can choose $q = p_k^{(j+1)}$ and thus obtain $\|s_k^{(j)}\| \le \|r_k^{(j+1)}\|$. In
summary, we have shown that

(2.5)   $\|r_k^{(j)}\| \le \|s_k^{(j)}\| \le \|r_k^{(j+1)}\| \le \|s_k^{(j+1)}\| \le \Psi_k(A), \qquad j = 1, 2, \dots.$

Hence the sequences of norms $\|r_k^{(j)}\|$ and $\|s_k^{(j)}\|$, $j = 1, 2, \dots$, interlace each other,
are both nondecreasing, and are both bounded by $\Psi_k(A)$. This implies that both
sequences converge to the same limit, which does not exceed $\Psi_k(A)$.

Consequently, for any initial vector $b^{(0)}$, Algorithm 1 converges to a vector that
satisfies the cross equality for $A$ and step $k$. If $b^{(0)}$ satisfies the cross equality for
$A$ and step $k$, then trivially equality holds in (2.5) for all $j$. On the other hand, if
equality holds in (2.5) for one $j$, then, using (2.1),

$\langle b^{(j-1)}, q_k^{(j)}(A^T)\,r_k^{(j)}\rangle = \|r_k^{(j)}\|^2 = \|q_k^{(j)}(A^T)\,r_k^{(j)}\|,$

and we have reached a vector that satisfies the cross equality.
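The cross iterations of Algorithm 1 and the interlacing property (2.5) can be tried out numerically. What follows is a minimal sketch rather than the authors' implementation: it assumes NumPy, a small random test matrix, and a dense least-squares helper `gmres_residual` (a hypothetical name) in place of a proper Arnoldi-based GMRES.

```python
import numpy as np

def gmres_residual(A, b, k):
    """k-th GMRES residual for (A, b) via dense least squares (illustrative)."""
    cols, w = [], b.copy()
    for _ in range(k):
        w = A @ w
        cols.append(w)
    K = np.column_stack(cols)
    c, *_ = np.linalg.lstsq(K, b, rcond=None)
    return b - K @ c

def cross_iterations(A, b, k, steps=30):
    """Algorithm 1: returns the final b and the interlaced norms
    ||r^(1)||, ||s^(1)||, ||r^(2)||, ||s^(2)||, ..."""
    b = b / np.linalg.norm(b)
    norms = []
    for _ in range(steps):
        r = gmres_residual(A, b, k)      # GMRES with A
        c = r / np.linalg.norm(r)
        s = gmres_residual(A.T, c, k)    # GMRES with A^T
        norms.extend([np.linalg.norm(r), np.linalg.norm(s)])
        b = s / np.linalg.norm(s)
    return b, norms

rng = np.random.default_rng(2)
n, k = 6, 3
A = rng.standard_normal((n, n)) + 3.0 * np.eye(n)  # assumed test matrix
b_final, norms = cross_iterations(A, rng.standard_normal(n), k)
```

In exact arithmetic the list `norms` is nondecreasing, as stated in (2.5); the limit it approaches depends on the starting vector and need not equal $\Psi_k(A)$.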

From the above it is clear that the cross equality represents a necessary condition
for a vector $b$ to be a worst-case initial vector. On the other hand, we can ask
whether this condition is sufficient, or, at least, whether the vectors that satisfy the
cross equality are in some sense special. To investigate this question we present the
following lemma.

Lemma 2.4. Let $A \in \mathbb{R}^{n\times n}$ be nonsingular, $k \ge 1$, and $b \in \mathbb{R}^n$ be a unit norm
initial vector with $d(A,b) > k$. If $r_k = \mathrm{GMRES}(A, b, k)$, then $d(A^T, r_k) > k$, and $b$
satisfies the cross equality for $A$ and the step $k$ if and only if $b \in \mathcal{K}_{k+1}(A^T, r_k)$. In particular,
each unit norm vector $b$ with $d(A,b) = n$ satisfies the cross equality for $A$ and the step
$k = n - 1$.

Fig. 2.1. Cross iterations for random initial vectors.

Proof. The nonzero GMRES residual $r_k \in b + A\mathcal{K}_k(A,b) \subset \mathcal{K}_{k+1}(A,b)$ is uniquely
determined by the orthogonality conditions (1.2), which can be written as

$0 = \langle A^j b, r_k\rangle = \langle b, (A^T)^j r_k\rangle, \qquad\text{for } j = 1, \dots, k,$

or, equivalently,

(2.6)   $b \perp A^T\mathcal{K}_k(A^T, r_k).$

Now let $s_k = \mathrm{GMRES}(A^T, r_k/\|r_k\|, k)$. From (2.1) we know that $\|s_k\| \ge \|r_k\| > 0$,
i.e., $d(A^T, r_k) > k$, and

(2.7)   $s_k \in \frac{r_k}{\|r_k\|} + A^T\mathcal{K}_k(A^T, r_k) \subset \mathcal{K}_{k+1}(A^T, r_k), \qquad s_k \perp A^T\mathcal{K}_k(A^T, r_k).$

If $b$ satisfies the cross equality for $A$ and the step $k$, then $b = s_k/\|s_k\|$ and (2.7)
implies that $b \in \mathcal{K}_{k+1}(A^T, r_k)$. On the other hand, if $b \in \mathcal{K}_{k+1}(A^T, r_k)$, then $\langle b, r_k\rangle =
\|r_k\|^2 \ne 0$ and (2.6), together with (2.7), imply that $b = s_k/\|s_k\|$: both $b$ and $s_k$ lie in the
one-dimensional orthogonal complement of $A^T\mathcal{K}_k(A^T, r_k)$ in $\mathcal{K}_{k+1}(A^T, r_k)$.

For $k = n - 1$, we have $\mathcal{K}_{k+1}(A^T, r_k) = \mathbb{R}^n$, i.e., $b \in \mathcal{K}_{k+1}(A^T, r_k)$ is always
satisfied. □

To give a numerical example for Algorithm 1 we consider $A$ being the Jordan
block $J_\lambda$ of size 11 with the eigenvalue $\lambda = 1$, and we choose $k = 5$. In this case,
the 5th ideal GMRES residual matrix of $J_\lambda$ has a simple maximal singular value, as numerically
observed in [12]. Using the results of Greenbaum and Gurvits in [3] we know that then
$\Psi_5(J_\lambda) = \varphi_5(J_\lambda)$, and, moreover, that the corresponding worst-case initial vector is
the right singular vector that corresponds to the maximal singular value of the ideal
GMRES residual matrix. Hence, in this case the 5th worst-case initial vector is uniquely
determined up to scaling.

In the left part of Fig. 2.1 we show the results of Algorithm 1 started with 20
random unit norm initial vectors. Each line represents the sequence $\|r_k^{(j)}\|$, $\|s_k^{(j)}\|$, for
$j = 1, \dots, 10$. At the end of each of the 20 runs we get a vector that satisfies (up to a
small inaccuracy) the cross equality for $J_\lambda$ and $k = 5$. We can observe that there are
many initial vectors that satisfy the cross equality, and there seems to be no special
structure in the norms that are attained in the end. In particular, none of the 20 runs
results in a 5th worst-case initial vector for which the norm $\Psi_5(A)$ is attained (this
value is visualized by the highest horizontal line in the figure).

We will now slightly modify the cross iteration Algorithm 1. Given an initial
vector $b^{(j-1)}$, we always apply both GMRES with $A$ and GMRES with $A^T$, and
look at the resulting GMRES residual norms. We take as the resulting residual the one
with the greater norm; see Algorithm 2. After the process converges, we get again a
vector that satisfies the cross equality.



Algorithm 2 (Cross iterations 2)

  $b^{(0)} = b$
  for $j = 1, 2, \dots$ do
    $v = \mathrm{GMRES}(A, b^{(j-1)}, k)$
    $w = \mathrm{GMRES}(A^T, b^{(j-1)}, k)$
    if $\|v\| \ge \|w\|$ then
      $t^{(j)} = v$
    else
      $t^{(j)} = w$
    end if
    $b^{(j)} = t^{(j)} / \|t^{(j)}\|$
  end for


This strategy is a little better than the original one when looking for a worst-case
initial vector; see Fig. 2.1. While it is usually not sufficient to find a worst-case
vector, one at least can find a reasonable initial point for an optimization procedure
that solves the nonlinear worst-case GMRES approximation problem.
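The modified cross iteration can be sketched in the same way. This is a hedged illustration that assumes NumPy, a random test matrix, and the reading of Algorithm 2 in which the residual with the greater norm is kept and normalized in every step; the helper names `gmres_residual` and `cross_iterations_max` are hypothetical.

```python
import numpy as np

def gmres_residual(A, b, k):
    """k-th GMRES residual for (A, b) via dense least squares (illustrative)."""
    cols, w = [], b.copy()
    for _ in range(k):
        w = A @ w
        cols.append(w)
    K = np.column_stack(cols)
    c, *_ = np.linalg.lstsq(K, b, rcond=None)
    return b - K @ c

def cross_iterations_max(A, b, k, steps=30):
    """Algorithm 2 (as read here): apply GMRES with A and with A^T to the
    current vector and continue with the residual of greater norm."""
    b = b / np.linalg.norm(b)
    norms = []
    for _ in range(steps):
        v = gmres_residual(A, b, k)
        w = gmres_residual(A.T, b, k)
        t = v if np.linalg.norm(v) >= np.linalg.norm(w) else w
        norms.append(np.linalg.norm(t))
        b = t / np.linalg.norm(t)
    return b, norms

rng = np.random.default_rng(3)
n, k = 6, 3
A = rng.standard_normal((n, n)) + 3.0 * np.eye(n)  # assumed test matrix
b_start, norms = cross_iterations_max(A, rng.standard_normal(n), k)
```

By the inequality (2.1) and its counterpart with $A$ and $A^T$ exchanged, the recorded norms are nondecreasing. The returned vector could then serve as a starting point for a general-purpose optimizer, as suggested in the text.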

3. Optimization point of view. Let a nonsingular matrix $A \in \mathbb{R}^{n\times n}$ and a
positive integer $k < d(A)$ be given. For vectors $c = [c_1, \dots, c_k]^T \in \mathbb{R}^k$ and $v \in \mathbb{R}^n$, we
define the function

(3.1)   $f(c,v) = \|p(A;c)v\|^2 = v^T p(A;c)^T p(A;c)\,v,$

where

$p(z;c) = 1 - \sum_{j=1}^{k} c_j z^j.$

Equivalently, we can express the function $f(c,v)$ using the matrix

$K(v) = [Av, A^2v, \dots, A^kv]$

as

(3.2)   $f(c,v) = \|v - K(v)c\|^2 = v^Tv - 2v^TK(v)c + c^TK(v)^TK(v)c.$

(Here only the dependence on $v$ is expressed in the notation $K(v)$, because $A$ and
$k$ are both fixed.) Note that $K(v)^TK(v)$ is the Gramian matrix of the vectors
$Av, A^2v, \dots, A^kv$,

$K(v)^TK(v) = \left[v^T(A^T)^iA^jv\right]_{i,j=1,\dots,k}.$
Next, we define the function

$g(v) = \min_{c\in\mathbb{R}^k} f(c,v),$

which represents the $k$th squared GMRES residual norm for the matrix $A$ and the
initial vector $v$, and we denote

$\Omega = \{u \in \mathbb{R}^n : d(A,u) > k\}, \qquad \Gamma = \{u \in \mathbb{R}^n : d(A,u) \le k\}.$

The set $\Gamma$ is a closed subset, $\Omega$ is an open subset of $\mathbb{R}^n$, and $\mathbb{R}^n = \Omega \cup \Gamma$. Note that
$g(v) > 0$ for all $v \in \Omega$ and $g(v) = 0$ for all $v \in \Gamma$. The following lemma is a special
case of [1, Proposition 2.2] for real data and nonsingular $A$.

Lemma 3.1. In the previous notation, the function $g(v)$ is a continuous function
of $v \in \mathbb{R}^n$, i.e., $g \in C^0(\mathbb{R}^n)$, and it is an infinitely differentiable function of $v \in \Omega$,
i.e., $g \in C^\infty(\Omega)$. Moreover, $\Gamma$ has measure zero in $\mathbb{R}^n$.

We next characterize the minimizer of the function $f(c,v)$ as a function of $v$.

Lemma 3.2. For each given $v \in \Omega$, the problem

$\min_{c\in\mathbb{R}^k} f(c,v)$

has the unique minimizer

$\gamma(v) = (K(v)^TK(v))^{-1}K(v)^Tv \in \mathbb{R}^k.$

As a function of $v \in \Omega$, this minimizer satisfies $\gamma(v) \in C^\infty(\Omega)$. Given $v \in \Omega$, $(\gamma(v), v)$
is the only point in $\mathbb{R}^k \times \Omega$ with

$\nabla_c f(\gamma(v), v) = 0.$

Proof. Since $v \in \Omega$ and $A$ is nonsingular, the vectors $Av, A^2v, \dots, A^kv$ are linearly
independent and $K(v)^TK(v)$ is symmetric and positive definite. Therefore, if $v \in \Omega$
is fixed, (3.2) is a quadratic functional in $c$, which attains its unique global minimum
at the stationary point

$\gamma(v) = (K(v)^TK(v))^{-1}K(v)^Tv.$

The function $\gamma(v)$ is a well defined rational function of $v \in \Omega$, and thus $\gamma(v) \in C^\infty(\Omega)$.
Note that the vector $\gamma(v)$ contains the coefficients of the $k$th GMRES polynomial that
corresponds to the initial vector $v \in \Omega$.
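The formulas (3.2) and the minimizer of Lemma 3.2 translate directly into a few lines of linear algebra. The sketch below assumes NumPy and a random test matrix; the function names `krylov_matrix`, `f` and `gamma` are hypothetical, and the normal equations are solved only to mirror the formula for $\gamma(v)$ (a least-squares solver is usually the more stable choice).

```python
import numpy as np

def krylov_matrix(A, v, k):
    """K(v) = [Av, A^2 v, ..., A^k v]."""
    cols, w = [], v.copy()
    for _ in range(k):
        w = A @ w
        cols.append(w)
    return np.column_stack(cols)

def f(A, c, v):
    """f(c, v) = ||v - K(v) c||^2, cf. (3.2)."""
    K = krylov_matrix(A, v, len(c))
    return float(np.sum((v - K @ c) ** 2))

def gamma(A, v, k):
    """Minimizer of c -> f(c, v) through the normal equations (Lemma 3.2)."""
    K = krylov_matrix(A, v, k)
    return np.linalg.solve(K.T @ K, K.T @ v)

rng = np.random.default_rng(4)
n, k = 6, 3
A = rng.standard_normal((n, n)) / np.sqrt(n) + 2.0 * np.eye(n)  # assumed test matrix
v = rng.standard_normal(n)

c_star = gamma(A, v, k)
K = krylov_matrix(A, v, k)
grad_c = 2.0 * (K.T @ K @ c_star - K.T @ v)        # gradient of (3.2) in c
c_lstsq = np.linalg.lstsq(K, v, rcond=None)[0]     # reference solution
c_other = c_star + 0.1 * rng.standard_normal(k)    # any other coefficient vector
```

At `c_star` the gradient with respect to $c$ vanishes up to rounding, and `f(A, c_star, v)` is the value $g(v)$, i.e. the squared $k$th GMRES residual norm for the initial vector $v$.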

As stated in Lemma 3.1, $g(v)$ is a continuous function on $\mathbb{R}^n$, and thus it is also
continuous on the unit sphere

$S = \{u \in \mathbb{R}^n : \|u\| = 1\}.$

Since $S$ is a compact set and $g(v)$ is continuous on this set, it attains its minimum
and maximum on $S$.

We are interested in the characterization of points $(\tilde c, \tilde v) \in \mathbb{R}^k \times S$ such that

(3.3)   $f(\tilde c, \tilde v) = \max_{v\in S}\,\min_{c\in\mathbb{R}^k} f(c,v) = \max_{v\in S} g(v).$

This is the worst-case GMRES problem (1.3). Since $g(v) = 0$ for all $v \in \Gamma$, we have

$\max_{v\in S} g(v) = \max_{v\in\Omega\cap S} g(v).$

To characterize the points $(\tilde c, \tilde v) \in \mathbb{R}^k \times S$ that satisfy (3.3), we define for every $c \in \mathbb{R}^k$
and $v \ne 0$ the two functions

$F(c,v) = \frac{f(c,v)}{v^Tv}, \qquad G(v) = \frac{g(v)}{v^Tv}.$

Clearly, for any $\alpha \ne 0$, we have

$F(c,\alpha v) = F(c,v), \qquad G(\alpha v) = G(v).$
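Lemma 3.3. It holds that $G(v) \in C^\infty(\Omega)$. A vector $\tilde v \in \Omega \cap S$ satisfies
$g(\tilde v) \ge g(v)$ for all $v \in S$
if and only if $\tilde v \in \Omega \cap S$ satisfies

$G(\tilde v) \ge G(v)$ for all $v \in \mathbb{R}^n \setminus \{0\}.$

Proof. Since $g \in C^\infty(\Omega)$ and $0 \notin \Omega$, it holds also that $G \in C^\infty(\Omega)$. If $\tilde v \in \Omega \cap S$
is a maximum of $G(v)$, then $\alpha\tilde v$ is a maximum as well, so the equivalence is obvious.
□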

Theorem 3.4. The vectors $\tilde c \in \mathbb{R}^k$ and $\tilde v \in S \cap \Omega$ that solve the problem

$\max_{v\in S}\,\min_{c\in\mathbb{R}^k} f(c,v)$

satisfy

(3.4)   $\nabla_c F(\tilde c, \tilde v) = 0, \qquad \nabla_v F(\tilde c, \tilde v) = 0,$

i.e., $(\tilde c, \tilde v)$ is a stationary point of the function $F(c,v)$.

Proof. Obviously, for any $v \in \Omega$,

$F(\gamma(v), v) = \frac{f(\gamma(v), v)}{v^Tv} \le \frac{f(c,v)}{v^Tv} = F(c,v) \qquad\text{for all } c \in \mathbb{R}^k,$

i.e., $\gamma(v)$ also minimizes the function $F(c,v)$, and

$\nabla_c F(\gamma(v), v) = 0, \qquad v \in \Omega.$

We know that $g(v)$ attains its maximum on $S$ at some point $\tilde v \in \Omega \cap S$. Therefore,
$G(v)$ attains its maximum also at $\tilde v$. Since $G(v) \in C^\infty(\Omega)$, it has to hold that

$\nabla G(\tilde v) = 0.$

Denoting $\tilde c = \gamma(\tilde v)$ and writing the function $G(v)$ as $G(v) = F(\gamma(v), v)$ we get

(3.5)   $\nabla G(\tilde v) = \nabla_v\gamma(\tilde v)\,\nabla_c F(\tilde c, \tilde v) + \nabla_v F(\tilde c, \tilde v),$

where $\nabla_v\gamma(\tilde v)$ is the $n \times k$ Jacobian matrix of the function $\gamma(v) : \mathbb{R}^n \to \mathbb{R}^k$ at the
point $\tilde v$. Here we used the standard chain rule for multivariate functions. Since
$\tilde v \in \Omega \cap S$, we know from the previous that $\nabla_c F(\tilde c, \tilde v) = 0$, and, therefore, using (3.5),
$\nabla_v F(\tilde c, \tilde v) = 0$. □

Theorem 3.5. If $(\tilde c, \tilde v)$ is a solution of the problem (3.3), then $\tilde v$ is a right
singular vector of the matrix $p(A;\tilde c)$.

Proof. Since $(\tilde c, \tilde v)$ solves the problem (3.3), we have $0 = \nabla_v F(\tilde c, \tilde v)$. Writing
$F(c,v)$ as a Rayleigh quotient,

$F(c,v) = \frac{v^T p(A;c)^T p(A;c)\,v}{v^Tv},$

we ask when $\nabla_v F(c,v) = 0$; for more details see pp. 114-115]. By differentiating
$F(c,v)$ with respect to $v$ we get

$\nabla_v F(c,v) = \frac{2\,p(A;c)^T p(A;c)\,v\,\|v\|^2 - 2\,v^T p(A;c)^T p(A;c)\,v\;v}{(v^Tv)^2},$

and the condition $0 = \nabla_v F(c,v)$ is equivalent to

$p(A;c)^T p(A;c)\,v = F(c,v)\,v.$

In other words, $\tilde v$ is a right singular vector of $p(A;\tilde c)$ and $\sigma = \sqrt{F(\tilde c, \tilde v)}$ is the corresponding
singular value. □
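The gradient formula used in this proof can be checked numerically. The following sketch assumes NumPy and uses a random matrix `P` as a hypothetical stand-in for a residual matrix $p(A;c)$ with fixed $c$; it compares the closed-form gradient of the Rayleigh quotient with central finite differences and evaluates the gradient at the right singular vectors of `P`.

```python
import numpy as np

def F(P, v):
    """Rayleigh quotient v^T P^T P v / v^T v."""
    Pv = P @ v
    return float(Pv @ Pv) / float(v @ v)

def grad_F(P, v):
    """Closed-form gradient: 2 (P^T P v - F(v) v) / (v^T v)."""
    return 2.0 * (P.T @ (P @ v) - F(P, v) * v) / float(v @ v)

rng = np.random.default_rng(5)
n = 5
P = rng.standard_normal((n, n))   # stand-in for p(A; c), an assumption
v = rng.standard_normal(n)

# Central finite differences along the coordinate directions.
h = 1e-6
fd = np.array([(F(P, v + h * e) - F(P, v - h * e)) / (2 * h) for e in np.eye(n)])

# Rows of Vt are right singular vectors of P; the gradient vanishes there.
_, S, Vt = np.linalg.svd(P)
grad_at_sv = [np.linalg.norm(grad_F(P, Vt[i])) for i in range(n)]
```

The gradient vanishes at every right singular vector, not only at the maximal one, which is consistent with the theorem asserting a singular vector without specifying which singular value it belongs to.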

Theorem 3.6. A point $(\tilde c, \tilde v) \in \mathbb{R}^k \times S$ that solves the problem (3.3) is a stationary
point of $F(c,v)$ in which the maximal value of $F(c,v)$ is attained.

Proof. Using Theorem 3.4 we know that any solution $(\tilde c, \tilde v) \in \mathbb{R}^k \times S$ of (3.3) is a
stationary point of $F(c,v)$. On the other hand, if $(\hat c, \hat v) \in \mathbb{R}^k \times S$ satisfies

$\nabla_v F(\hat c, \hat v) = 0, \qquad \nabla_c F(\hat c, \hat v) = 0,$

then $p(A;\hat c)$ is the GMRES polynomial that corresponds to $\hat v$ and

$F(\hat c, \hat v) = \|p(A;\hat c)\hat v\|^2 \le \|p(A;\tilde c)\tilde v\|^2 = F(\tilde c, \tilde v).$

Hence, $(\tilde c, \tilde v)$ is a stationary point of $F(c,v)$ in which the maximal value of $F(c,v)$ is
attained. □

As a consequence of previous results we can formulate the following corollary. 

Corollary 3.7. Let $A \in \mathbb{R}^{n\times n}$ be a nonsingular matrix and let $1 \le k \le d(A) - 1$.
Let $b$ be a $k$th unit norm worst-case GMRES initial vector and let $p_k \in \pi_k$ be the
corresponding $k$th worst-case GMRES polynomial. Then $p_k$ is also the $k$th worst-case
GMRES polynomial for $A^T$ and the initial vector $r_k/\|r_k\|$.

Proof. Using Theorem 3.5 and Theorem 3.6 we know that

(3.6)   $\Psi_k^2(A)\,b = p_k(A^T)\,p_k(A)\,b,$

i.e., that $b$ is a right singular vector of the GMRES residual matrix $p_k(A)$ that corresponds
to the maximal value of $F(c,v)$, i.e., to $\Psi_k^2(A)$. From (2.4) we also know
that

(3.7)   $\Psi_k^2(A)\,b = q_k(A^T)\,p_k(A)\,b,$

where $q_k$ is the GMRES polynomial that corresponds to $A^T$ and the initial vector
$r_k$. Comparing (3.6) and (3.7), and using the uniqueness of GMRES polynomials, it
follows that $p_k = q_k$. □
4. Non-uniqueness of worst-case GMRES polynomials. In this section we
prove that a worst-case GMRES polynomial may not be uniquely determined, and
we give a numerical example for the occurrence of a non-unique case. Our results are
based on Toh's parameterized family of (nonsingular) matrices

(4.1)   $A = A(\omega,\varepsilon) = \begin{bmatrix} 1 & \varepsilon & & \\ & -1 & \omega/\varepsilon & \\ & & 1 & \varepsilon \\ & & & -1 \end{bmatrix}, \qquad 0 < \omega < 2, \quad 0 < \varepsilon.$

Toh used these matrices in [13] to show that $\Psi_3(A)/\varphi_3(A) \to 0$ for $\varepsilon \to 0$ and each
$\omega \in (0,2)$ [13, Theorem 2.3]. In other words, he proved that the ratio of the worst-case
and ideal GMRES approximations can be arbitrarily small.

Theorem 4.1. If $p_k(z)$ is a $k$th worst-case GMRES polynomial of $A$ in (4.1),
then $p_k(-z)$ is also a $k$th worst-case GMRES polynomial of $A$.

In particular, $p_3(z) \ne p_3(-z)$, so the third worst-case GMRES polynomial of $A$
is not uniquely determined.

Proof. Let $b$ be any unit norm $k$th worst-case initial vector of $A$, and consider the
orthogonal similarity transformation

$A = -QA^TQ^T, \qquad Q = \begin{bmatrix} & & & 1 \\ & & -1 & \\ & 1 & & \\ -1 & & & \end{bmatrix}.$

Then

$p_k(A)b = Q\,p_k(-A^T)\,Q^Tb \qquad\text{and}\qquad \Psi_k(A) = \|p_k(A)b\| = \|p_k(-A^T)w\| = \Psi_k(A^T),$

where $w = Q^Tb$. In other words, $p_k(-z)$ is a $k$th worst-case GMRES polynomial for
$A^T$ and, using Corollary 3.7, it is also a $k$th worst-case GMRES polynomial for the
matrix $A$.

Let $p_3(z) \in \pi_3$ be any third worst-case GMRES polynomial for the matrix $A$. To
show that $p_3(-z) \ne p_3(z)$ it suffices to show that $p_3(z)$ contains odd powers of $z$, i.e.,
that

(4.2)   $p_3(z) \ne 1 - \beta z^2$ for any $\beta \in \mathbb{R}$.

Define the matrix

$B = \begin{bmatrix} 1 & & \omega & \\ & 1 & & \omega \\ & & 1 & \\ & & & 1 \end{bmatrix} = A^2.$

From [13, Theorem 2.1] we know that the (uniquely determined) third ideal GMRES
polynomial of $A$ is of the form

(4.3)   $p_*(z) = 1 + (\alpha - 1)z^2, \qquad \alpha = \frac{2\omega^2}{4 + \omega^2}.$

Therefore,

$\min_{p\in\pi_3} \|p(A)\| = \min_{p\in\pi_1}\,\max_{\|v\|=1} \|p(B)v\| = \max_{\|v\|=1}\,\min_{p\in\pi_1} \|p(B)v\|,$

where the last equality follows from the fact that the ideal and worst-case GMRES
approximations are equal for $k = 1$ [3]. If a third worst-case polynomial of $A$ is of
the form $1 - \beta z^2$ for some $\beta$, then

$\Psi_3(A) = \max_{\|v\|=1}\,\min_{p\in\pi_3} \|p(A)v\| = \max_{\|v\|=1}\,\min_{p\in\pi_1} \|p(B)v\| = \min_{p\in\pi_3} \|p(A)\| = \varphi_3(A).$

This, however, contradicts the main result by Toh that $\Psi_3(A) < \varphi_3(A)$; see [13,
Theorem 2.2]. □
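The algebraic identities used in this proof are easy to verify numerically. The sketch below assumes NumPy and takes the Toh matrix as the upper bidiagonal matrix with diagonal $1, -1, 1, -1$ and superdiagonal $\varepsilon, \omega/\varepsilon, \varepsilon$, with $Q$ the signed anti-diagonal permutation; both explicit forms are reconstructions and should be read as assumptions. The helper `poly_at` is hypothetical, and the coefficients are the rounded values of $p_3$ reported in the numerical example of this section.

```python
import numpy as np

def toh_matrix(omega, eps):
    """Assumed form of the 4 x 4 Toh matrix A(omega, eps)."""
    return np.array([[1.0, eps, 0.0, 0.0],
                     [0.0, -1.0, omega / eps, 0.0],
                     [0.0, 0.0, 1.0, eps],
                     [0.0, 0.0, 0.0, -1.0]])

def poly_at(M, coeffs):
    """Evaluate sum_j coeffs[j] M^j (coefficients ordered low to high)."""
    out, Mj = np.zeros_like(M), np.eye(M.shape[0])
    for cj in coeffs:
        out = out + cj * Mj
        Mj = Mj @ M
    return out

# Assumed form of Q: signed anti-diagonal permutation matrix.
Q = np.array([[0, 0, 0, 1],
              [0, 0, -1, 0],
              [0, 1, 0, 0],
              [-1, 0, 0, 0]], dtype=float)

omega, eps = 1.0, 0.1
A = toh_matrix(omega, eps)
B = A @ A
B_expected = np.array([[1, 0, omega, 0],
                       [0, 1, 0, omega],
                       [0, 0, 1, 0],
                       [0, 0, 0, 1]], dtype=float)

coeffs = [1.0, 0.243, -0.895, -0.025]        # rounded p_3 from the text
lhs = poly_at(A, coeffs)                     # p(A)
rhs = Q @ poly_at(-A.T, coeffs) @ Q.T        # Q p(-A^T) Q^T
```

With these definitions `B` equals `B_expected`, `A` equals `-Q @ A.T @ Q.T`, and `lhs` equals `rhs` up to rounding, which is the identity $p_k(A) = Q\,p_k(-A^T)\,Q^T$ used above.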

To compute examples of worst-case GMRES polynomials for the Toh matrix (4.1)
numerically we chose $\varepsilon = 0.1$ and $\omega = 1$, and we used the function fminsearch from
Matlab's Optimization Toolbox. We computed the value

$\Psi_3(A) = 0.4579$

(we present the numerical results only to 4 digits) with the corresponding third worst-case
initial vector

$b = [-0.6376, 0.0471, 0.2188, 0.7371]^T$

and the worst-case GMRES polynomial

$p_3(z) = -0.025z^3 - 0.895z^2 + 0.243z + 1 = -0.025\,(z - 1.181)(z + 0.939)(z + 35.96).$

One can numerically check that $b$ is the right singular vector of $p_3(A)$ that corresponds
to the second maximal singular value of $p_3(A)$. From Theorem 4.1 we know that
$q_3(z) = p_3(-z)$ is also a third worst-case GMRES polynomial. One can now find the
corresponding worst-case initial vector leading to the polynomial $q_3$ using the singular
value decomposition (SVD)

$p_3(A) = USV^T,$

where the singular values are ordered nonincreasingly on the diagonal of $S$. We know
(by numerical observation) that $b$ is the second column of $V$. We now compute the
SVD of $q_3(A)$, and define the corresponding initial vector as the right singular vector
that corresponds to the second maximal singular value of $q_3(A)$. It holds that

$p_3(A^T) = p_3(A)^T = VSU^T.$

Since $A^T = -QAQ^T$, we get $Q\,p_3(-A)\,Q^T = VSU^T$, or, equivalently,

$q_3(A) = (Q^TV)\,S\,(Q^TU)^T.$

So, the columns of the matrix $Q^TU$ are right singular vectors of $q_3(A)$ and the vector
$Q^Tu_2$, where $u_2$ is the second column of $U$, is the worst-case initial vector that gives
the worst-case GMRES polynomial $q_3(z) = p_3(-z)$.

5. Ideal versus worst-case GMRES phenomenon. As mentioned above,
Toh [13] as well as Faber, Joubert, Knill, and Manteuffel [1] have shown that worst-case
GMRES and ideal GMRES are different approximation problems in the sense
that there exist matrices $A$ and iteration steps $k$ for which $\Psi_k(A) < \varphi_k(A)$. In
this section we further study these two approximation problems. We start with a
geometrical characterization related to the function $f(c,v)$ from (3.1).
Theorem 5.1. Let $A \in \mathbb{R}^{n\times n}$ be a nonsingular matrix and let $1 \le k \le d(A) - 1$.
The $k$th ideal and worst-case GMRES approximations are equal, i.e.,

(5.1)   $\max_{v\in S}\,\min_{c\in\mathbb{R}^k} f(c,v) = \min_{c\in\mathbb{R}^k}\,\max_{v\in S} f(c,v),$

if and only if $f(c,v)$ has a saddle point in $\mathbb{R}^k \times S$.

Proof. If $f(c,v)$ has a saddle point in $\mathbb{R}^k \times S$, then there exist vectors $\tilde c \in \mathbb{R}^k$ and
$\tilde v \in S$ such that

$f(\tilde c, v) \le f(\tilde c, \tilde v) \le f(c, \tilde v) \qquad \forall c \in \mathbb{R}^k,\ \forall v \in S.$

The condition $f(\tilde c, v) \le f(\tilde c, \tilde v)$ for all $v \in S$ implies that $\tilde v$ is a maximal right singular
vector of the matrix $p(A;\tilde c)$. If $f(\tilde c, \tilde v) \le f(c, \tilde v)$ for all $c \in \mathbb{R}^k$, then $p(z;\tilde c)$ is the
GMRES polynomial that corresponds to the initial vector $\tilde v$. In other words, if $f(c,v)$
has a saddle point in $\mathbb{R}^k \times S$, then there exist a polynomial $p(z;\tilde c)$ and a unit norm
vector $\tilde v$ such that $\tilde v$ is a maximal right singular vector of $p(A;\tilde c)$ and

$p(A;\tilde c)\,\tilde v \perp A\mathcal{K}_k(A, \tilde v).$

Using Lemma 2.4], the $k$th ideal and worst-case GMRES approximations are then
equal.

On the other hand, if the condition (5.1) is satisfied, then $f(c,v)$ has a saddle
point in $\mathbb{R}^k \times S$. □

In other words, the $k$th ideal and worst-case GMRES approximations are equal
if and only if the points $(\tilde c, \tilde v) \in \mathbb{R}^k \times S$ that solve the worst-case GMRES problem
are also the saddle points of $f(c,v)$ in $\mathbb{R}^k \times S$.

We next extend the original construction of Toh [13] to obtain some further numerical
examples in which $\Psi_k(A) < \varphi_k(A)$. Note that the Toh matrix (4.1) is not
diagonalizable. In particular, for $\omega = 1$ we have $A = XJX^{-1}$, where



1 1 
1 



-1 1 



1 



X 



c 

-2 





e e — e 

-1 1 

-2e 2e 

4 



One can ask whether the phenomenon $\Psi_k(A) < \varphi_k(A)$ can appear also for diagonalizable
matrices. The answer is yes, since both $\Psi_k(A)$ and $\varphi_k(A)$ are continuous
functions on the open set of nonsingular matrices; see [1, Theorem 2.5 and Theorem 2.6].
Hence one can slightly perturb the diagonal of the Toh matrix (4.1) in order
to obtain a diagonalizable matrix $A$ for which $\Psi_k(A) < \varphi_k(A)$.

For $\omega = 1$, the Toh matrix is an upper bidiagonal matrix with the alternating
diagonal entries $1$ and $-1$, and the alternating superdiagonal entries $\varepsilon$ and $\varepsilon^{-1}$. One
can consider such a matrix for any $n \ge 4$, i.e.,

$A = \begin{bmatrix} 1 & \varepsilon & & & \\ & -1 & \varepsilon^{-1} & & \\ & & 1 & \varepsilon & \\ & & & \ddots & \ddots \\ & & & & \pm 1 \end{bmatrix},$

and look at the values of $\Psi_k(A)$ and $\varphi_k(A)$. If $n$ is even, we found numerically that
$\Psi_k(A) = \varphi_k(A)$ for $k \ne n - 1$ and $\Psi_{n-1}(A) < \varphi_{n-1}(A)$. If $n$ is odd, then our numerical
experiments showed that $\Psi_k(A) = \varphi_k(A)$ for $k \ne n - 2$ and $\Psi_{n-2}(A) < \varphi_{n-2}(A)$.
Hence for all such matrices worst-case and ideal GMRES differ from each other for
exactly one $k$.

Fig. 5.1. Ideal and worst-case GMRES can differ from step 3 up to the step $2n - 1$. (Curves: ideal GMRES, worst-case GMRES; horizontal axis: number of iterations $k$.)

Inspired by the Toh matrix, we define the n x n matrices (for any n > 2) 



A 











and use them to construct the matrix

$A = \begin{bmatrix} J_1 & \omega E \\ & J_{-1} \end{bmatrix}, \qquad \omega > 0.$



One can numerically observe that here $\Psi_k(A) < \varphi_k(A)$ for all steps $k = 3, \dots, 2n - 1$.
As an example, we plot in Fig. 5.1 the ideal and worst-case GMRES convergence curves
for $n = 4$, i.e., $A$ is an $8 \times 8$ matrix, $\omega = 4$ and $\varepsilon = 0.1$. Varying the parameter $\omega$ will
influence the difference between worst-case and ideal GMRES in these examples.

6. Ideal and worst-case GMRES for complex vectors or polynomials.
We now ask whether the values of the max-min approximation (1.3) and the min-max
approximation (1.4) for a matrix $A \in \mathbb{R}^{n\times n}$ can change if we allow the maximization
over complex vectors and/or the minimization over complex polynomials. The answer
to this question will show that the two approximation problems indeed are of a
different nature.
Let us define

$\varphi_{k,\mathbb{K},\mathbb{F}}(A) = \min_{p\in\pi_{k,\mathbb{K}}}\ \max_{b\in\mathbb{F}^n,\ \|b\|=1} \|p(A)b\|, \qquad \Psi_{k,\mathbb{K},\mathbb{F}}(A) = \max_{b\in\mathbb{F}^n,\ \|b\|=1}\ \min_{p\in\pi_{k,\mathbb{K}}} \|p(A)b\|,$

where $\mathbb{K}$ and $\mathbb{F}$ are either the real or the complex numbers. Hence, the previously
used $\varphi_k(A)$, $\Psi_k(A)$, and $\pi_k$ are now denoted by $\varphi_{k,\mathbb{R},\mathbb{R}}(A)$, $\Psi_{k,\mathbb{R},\mathbb{R}}(A)$, and $\pi_{k,\mathbb{R}}$,
respectively. We first analyze the case of $\varphi_{k,\mathbb{K},\mathbb{F}}(A)$.
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Theorem 6.1. For a nonsingular matrix A £ R" xn and 1 < k < d(A) - I, 

¥>fc,R,K(^) = ^fc,C,R(^) = <Pk,M.,c(A) = Vk,C,c(A). 



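Proof. Since

\[
\max_{\substack{v \in \mathbb{R}^n \\ \|v\| = 1}} \|Bv\| = \|B\| = \max_{\substack{v \in \mathbb{C}^n \\ \|v\| = 1}} \|Bv\|
\]

holds for any real matrix $B \in \mathbb{R}^{n \times n}$, we have $\varphi_{k,\mathbb{R},\mathbb{R}}(A) = \varphi_{k,\mathbb{R},\mathbb{C}}(A)$.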

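Next, from $\mathbb{R} \subset \mathbb{C}$ we get immediately $\varphi_{k,\mathbb{C},\mathbb{R}}(A) \leq \varphi_{k,\mathbb{R},\mathbb{R}}(A)$. On the other hand,
writing $p \in \pi_{k,\mathbb{C}}$ in the form $p = p_r + i\,p_i$, where $p_r \in \pi_{k,\mathbb{R}}$ and $p_i$ is a real polynomial
of degree at most $k$ such that $p_i(0) = 0$, and using that $p_r(A)b$ and $p_i(A)b$ are real vectors
for real $A$ and $b$, we get

\[
\begin{aligned}
\varphi_{k,\mathbb{C},\mathbb{R}}^2(A)
&= \min_{p \in \pi_{k,\mathbb{C}}} \max_{\substack{b \in \mathbb{R}^n \\ \|b\| = 1}} \|p(A)b\|^2
 = \min_{p \in \pi_{k,\mathbb{C}}} \max_{\substack{b \in \mathbb{R}^n \\ \|b\| = 1}} \left( \|p_r(A)b\|^2 + \|p_i(A)b\|^2 \right) \\
&\geq \min_{p_r \in \pi_{k,\mathbb{R}}} \max_{\substack{b \in \mathbb{R}^n \\ \|b\| = 1}} \|p_r(A)b\|^2
 = \varphi_{k,\mathbb{R},\mathbb{R}}^2(A),
\end{aligned}
\]

so that $\varphi_{k,\mathbb{C},\mathbb{R}}(A) = \varphi_{k,\mathbb{R},\mathbb{R}}(A)$. Finally, from [SJ Theorem 3.1] we obtain $\varphi_{k,\mathbb{R},\mathbb{R}}(A) = \varphi_{k,\mathbb{C},\mathbb{C}}(A)$. □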
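Two elementary facts used in the proof above can be checked numerically. The sketch below is an illustration under stated assumptions, not part of the paper: it assumes NumPy, uses random test matrices, and the helper `apply_poly` and the sample polynomial coefficients are hypothetical. It checks (i) that the spectral norm of a real matrix is attained at a real unit vector and is not exceeded by sampled complex unit vectors, and (ii) that $\|p(A)b\|^2 = \|p_r(A)b\|^2 + \|p_i(A)b\|^2$ for real $A$, $b$ and $p = p_r + i\,p_i$.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5

# (i) Spectral norm of a real matrix: attained at a real right singular vector.
B = rng.standard_normal((n, n))
_, s, Vt = np.linalg.svd(B)
v_real = Vt[0]                         # real unit vector
attained_real = np.linalg.norm(B @ v_real)

# Sampled complex unit vectors never give a larger value.
Z = rng.standard_normal((n, 2000)) + 1j * rng.standard_normal((n, 2000))
Z /= np.linalg.norm(Z, axis=0)
complex_sample_max = np.linalg.norm(B @ Z, axis=0).max()

# (ii) Splitting of ||p(A) b||^2 for real A and b.
def apply_poly(coeffs, A, b):
    """Return (c_0 I + c_1 A + ... + c_k A^k) b using Horner's rule."""
    result = np.zeros(len(b), dtype=np.result_type(coeffs, b))
    for c in coeffs[::-1]:
        result = A @ result + c * b
    return result

A = rng.standard_normal((n, n))
b = rng.standard_normal(n)
b /= np.linalg.norm(b)
p_r = np.array([1.0, 0.3, -0.2])       # p_r(0) = 1 (illustrative coefficients)
p_i = np.array([0.0, 0.5, 0.1])        # p_i(0) = 0 (illustrative coefficients)

lhs = np.linalg.norm(apply_poly(p_r + 1j * p_i, A, b)) ** 2
rhs = (np.linalg.norm(apply_poly(p_r, A, b)) ** 2
       + np.linalg.norm(apply_poly(p_i, A, b)) ** 2)
```

The sampling in (i) is only a sanity check of the inequality; the equality itself follows from the singular value decomposition of a real matrix having real singular vectors.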

Since the value of $\varphi_{k,\mathbb{K},\mathbb{F}}(A)$ does not change when choosing for $\mathbb{K}$ and $\mathbb{F}$ real or
complex numbers, we will again use the simple notation $\varphi_k(A)$ in the following text.
The situation for the quantities corresponding to the worst-case GMRES approximation
is more complicated. Our proof of this fact uses the following lemma.

Lemma 6.2. If $A = A(\omega, \varepsilon)$ is the Toh matrix defined in (4.1) and

\[
B = \begin{bmatrix} A & \\ & A \end{bmatrix}, \tag{6.1}
\]

then $\Psi_{3,\mathbb{R},\mathbb{R}}(B) = \varphi_3(A)$.

Proof. Using the structure of $B$ it is easy to see that $\Psi_{k,\mathbb{R},\mathbb{R}}(B) \leq \varphi_k(A)$ for
any $k$. To prove the equality, it suffices to find a real unit norm vector $w$ with

\[
\min_{p \in \pi_{3,\mathbb{R}}} \|p(B)w\| = \varphi_3(A) = \min_{p \in \pi_{3,\mathbb{R}}} \|p(A)\|. \tag{6.2}
\]

The solution $p_*$ of the ideal GMRES problem on the right hand side of (6.2) is given by
(4.3). Toh showed in [13, p. 32] that $p_*(A)$ has a twofold maximal singular value $\sigma$, and
that the corresponding right and left singular vectors are given (up to a normalization)
by



[vi,v 2 ] 





c 


-2 



c 

-2 






2 


— c 



2 


— c 




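i.e., $\sigma u_1 = p_*(A)v_1$ and $\sigma u_2 = p_*(A)v_2$, where $\sigma = \|p_*(A)\|$.
Let us define

\[
w = \frac{1}{\left\| \begin{bmatrix} v_1 \\ v_2 \end{bmatrix} \right\|} \begin{bmatrix} v_1 \\ v_2 \end{bmatrix},
\qquad q(z) = p_*(z).
\]

Using

\[
q(B) \begin{bmatrix} v_1 \\ v_2 \end{bmatrix} = \sigma \begin{bmatrix} u_1 \\ u_2 \end{bmatrix}
\qquad \text{and} \qquad
\left\| \begin{bmatrix} v_1 \\ v_2 \end{bmatrix} \right\| = \left\| \begin{bmatrix} u_1 \\ u_2 \end{bmatrix} \right\|
\]

we see that $\|q(B)w\| = \sigma$. To prove (6.2) it is sufficient to show that $q$ is the third
GMRES polynomial for $B$ and $w$, i.e., that $q$ satisfies $q(B)w \perp B^j w$ for $j = 1, 2, 3$,
or, equivalently,

\[
\begin{bmatrix} u_1 \\ u_2 \end{bmatrix}^T
\begin{bmatrix} A^j & \\ & A^j \end{bmatrix}
\begin{bmatrix} v_1 \\ v_2 \end{bmatrix}
= u_1^T A^j v_1 + u_2^T A^j v_2 = 0, \qquad j = 1, 2, 3.
\]

Using linear algebra calculations we get $u_1^T A v_1 = -4c = -u_2^T A v_2$, and

\[
0 = u_1^T A^2 v_1 = u_2^T A^2 v_2 = u_1^T A^3 v_1 = u_2^T A^3 v_2.
\]

Therefore, we have found a unit norm initial vector $w$ and the corresponding third
GMRES polynomial $q$ such that $\|q(B)w\| = \varphi_3(A)$. □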
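The proof above relies on the standard characterization of the $k$th GMRES polynomial for a given initial vector: the residual $q(B)w$ is orthogonal to $B^j w$ for $j = 1, \dots, k$. The following minimal sketch illustrates this characterization on random data, not on the Toh matrix or its singular vectors. It assumes NumPy and a zero initial guess; the helper `gmres_residual` is hypothetical and solves the least squares problem directly instead of running the Arnoldi-based GMRES algorithm.

```python
import numpy as np

def gmres_residual(A, b, k):
    """Residual r = p(A) b minimizing ||p(A) b|| over p in pi_k, computed
    from the least squares problem min_c ||b - [A b, ..., A^k b] c||."""
    columns, v = [], b
    for _ in range(k):
        v = A @ v
        columns.append(v)
    K = np.column_stack(columns)               # [A b, A^2 b, ..., A^k b]
    coeffs, *_ = np.linalg.lstsq(K, b, rcond=None)
    return b - K @ coeffs, K, coeffs

rng = np.random.default_rng(0)
n, k = 8, 3
A = rng.standard_normal((n, n))
w = rng.standard_normal(n)
w /= np.linalg.norm(w)

r, K, coeffs = gmres_residual(A, w, k)
inner_products = K.T @ r                       # entries (A^j w)^T r, j = 1..k

# Any other polynomial in pi_k gives a residual that is at least as large.
perturbed_norm = np.linalg.norm(w - K @ (coeffs + 0.1))
```

Up to rounding errors the entries of `inner_products` vanish, and `perturbed_norm` exceeds the norm of `r`.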

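We next analyze the quantities $\Psi_{k,\mathbb{K},\mathbb{F}}(A)$.

Theorem 6.3. For a nonsingular matrix $A \in \mathbb{R}^{n \times n}$ and $1 \leq k \leq d(A) - 1$,

\[
\Psi_{k,\mathbb{R},\mathbb{R}}(A) = \Psi_{k,\mathbb{C},\mathbb{R}}(A) \leq \Psi_{k,\mathbb{C},\mathbb{C}}(A) \leq \Psi_{k,\mathbb{R},\mathbb{C}}(A),
\]

where both inequalities can be strict.

Proof. For a real initial vector $b$, the corresponding GMRES polynomial is
uniquely determined and real. This implies $\Psi_{k,\mathbb{C},\mathbb{R}}(A) = \Psi_{k,\mathbb{R},\mathbb{R}}(A)$. Next, from
Theorem 3.1] it follows that $\Psi_{k,\mathbb{R},\mathbb{R}}(A) \leq \Psi_{k,\mathbb{C},\mathbb{C}}(A)$. Finally, using $\mathbb{R} \subset \mathbb{C}$ we get
$\Psi_{k,\mathbb{C},\mathbb{C}}(A) \leq \Psi_{k,\mathbb{R},\mathbb{C}}(A)$.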

It remains to show that the inequalities can be strict. For the first inequality,
as shown in [16, Section 4], there exist real matrices $A$ and certain complex (unit
norm) initial vectors $b$ for which $\min_{p \in \pi_{k,\mathbb{C}}} \|p(A)b\| = 1$ for $k = 1, \dots, n-1$ (complete
stagnation), while such complete stagnation does not occur for any real (unit norm)
initial vector. Therefore, there are matrices for which $\Psi_{k,\mathbb{C},\mathbb{R}}(A) < \Psi_{k,\mathbb{C},\mathbb{C}}(A)$.

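To show that the second inequality can be strict, we note that for any $A \in \mathbb{R}^{n \times n}$,
the corresponding matrix $B \in \mathbb{R}^{2n \times 2n}$ of the form (6.1), and $1 \leq k \leq d(A) - 1$,

\[
\begin{aligned}
\Psi_{k,\mathbb{R},\mathbb{C}}^2(A)
&= \max_{\substack{b \in \mathbb{C}^n \\ \|b\| = 1}} \min_{p \in \pi_{k,\mathbb{R}}} \|p(A)b\|^2 \\
&= \max_{\substack{u, v \in \mathbb{R}^n \\ \|u\|^2 + \|v\|^2 = 1}} \min_{p \in \pi_{k,\mathbb{R}}} \|p(A)(u + i v)\|^2 \\
&= \max_{\substack{u, v \in \mathbb{R}^n \\ \|u\|^2 + \|v\|^2 = 1}} \min_{p \in \pi_{k,\mathbb{R}}} \left( \|p(A)u\|^2 + \|p(A)v\|^2 \right) \\
&= \max_{\substack{w \in \mathbb{R}^{2n} \\ \|w\| = 1}} \min_{p \in \pi_{k,\mathbb{R}}} \|p(B)w\|^2
 = \Psi_{k,\mathbb{R},\mathbb{R}}^2(B).
\end{aligned} \tag{6.3}
\]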



Now let $A$ be the Toh matrix (4.1) and $k = 3$. Toh showed in [13, Theorem 2.2] that
for any unit norm $b \in \mathbb{C}^4$ and the corresponding third GMRES polynomial $p_b \in \pi_{3,\mathbb{C}}$,

\[
\|p_b(A)b\| < \varphi_3(A).
\]

Hence $\Psi_{3,\mathbb{C},\mathbb{C}}(A) < \varphi_3(A)$. Lemma 6.2 and equation (6.3) imply $\varphi_3(A) = \Psi_{3,\mathbb{R},\mathbb{C}}(A)$,
which completes the proof of the strict inequality. □
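Two ingredients of the proof above lend themselves to a quick numerical check: for a real initial vector the minimization over complex polynomials returns a real polynomial, and, as in (6.3), minimizing over real polynomials for a complex vector $u + iv$ is the same problem as GMRES for the block matrix $B$ with the real vector $[u^T, v^T]^T$. The sketch below is an illustration under stated assumptions, not the authors' computation: it assumes NumPy, uses a random real matrix instead of the Toh matrix, and the helper `krylov_matrix` is hypothetical.

```python
import numpy as np

def krylov_matrix(A, b, k):
    """Matrix with columns A b, A^2 b, ..., A^k b."""
    columns, v = [], b
    for _ in range(k):
        v = A @ v
        columns.append(v)
    return np.column_stack(columns)

rng = np.random.default_rng(1)
n, k = 5, 2
A = rng.standard_normal((n, n))

# Real b, complex coefficients allowed: the minimizer is real.
b = rng.standard_normal(n)
b /= np.linalg.norm(b)
K = krylov_matrix(A, b, k).astype(complex)
coeffs, *_ = np.linalg.lstsq(K, b.astype(complex), rcond=None)
max_imag = np.abs(coeffs.imag).max()

# Complex vector u + i v with ||u||^2 + ||v||^2 = 1, real coefficients only.
u = rng.standard_normal(n)
v = rng.standard_normal(n)
scale = np.sqrt(u @ u + v @ v)
u, v = u / scale, v / scale
K_c = krylov_matrix(A, u + 1j * v, k)
K_stacked = np.vstack([K_c.real, K_c.imag])    # enforces real coefficients
c_real, *_ = np.linalg.lstsq(K_stacked, np.concatenate([u, v]), rcond=None)
res_complex_vector = np.linalg.norm((u + 1j * v) - K_c @ c_real)

# Same value from GMRES for B = diag(A, A) and the real vector w = [u; v].
B = np.block([[A, np.zeros((n, n))], [np.zeros((n, n)), A]])
w = np.concatenate([u, v])
K_B = krylov_matrix(B, w, k)
c_B, *_ = np.linalg.lstsq(K_B, w, rcond=None)
res_block = np.linalg.norm(w - K_B @ c_B)
```

Here `max_imag` is zero up to rounding, and `res_complex_vector` agrees with `res_block`, which is the identity between the second and the last line of (6.3) for one fixed pair $u$, $v$.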
Our proof concerning the strictness of the first inequality in the previous theorem
relied on a numerical example given in [16, Section 4]. We will now give an alternative
construction based on the non-uniqueness of the worst-case GMRES polynomial,
which will lead to an example with

\[
\Psi_{k,\mathbb{R},\mathbb{R}}(A) < \Psi_{k,\mathbb{R},\mathbb{C}}(A).
\]

Suppose that $A$ is a real matrix for which in a certain step $k$ two different worst-case
polynomials $p_b \in \pi_{k,\mathbb{R}}$ and $p_c \in \pi_{k,\mathbb{R}}$ with corresponding real unit norm initial vectors
$b$ and $c$ exist, so that

\[
\Psi_{k,\mathbb{R},\mathbb{R}}(A) = \|p_b(A)b\| = \|p_c(A)c\|.
\]

Note that since $p_b$ and $p_c$ are the uniquely determined GMRES polynomials that solve
the problem for the corresponding real initial vectors, it holds that

\[
\|p_b(A)b\| < \|p(A)b\|, \qquad \|p_c(A)c\| < \|p(A)c\| \tag{6.4}
\]

for any polynomial $p \in \pi_{k,\mathbb{C}} \setminus \{p_b, p_c\}$.

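Writing any unit norm complex vector $w \in \mathbb{C}^n$ in the form $w = (\cos\theta)\,u + i\,(\sin\theta)\,v$, with
$u, v \in \mathbb{R}^n$, $\|u\| = \|v\| = 1$, we get

\[
\begin{aligned}
\Psi_{k,\mathbb{R},\mathbb{C}}^2(A)
&= \max_{\substack{b \in \mathbb{C}^n \\ \|b\| = 1}} \min_{p \in \pi_{k,\mathbb{R}}} \|p(A)b\|^2 \\
&= \max_{\substack{\theta \in \mathbb{R} \\ \|u\| = \|v\| = 1}} \min_{p \in \pi_{k,\mathbb{R}}} \left( \cos^2\theta\, \|p(A)u\|^2 + \sin^2\theta\, \|p(A)v\|^2 \right) \\
&\geq \max_{\theta \in \mathbb{R}} \min_{p \in \pi_{k,\mathbb{R}}} \left( \cos^2\theta\, \|p(A)b\|^2 + \sin^2\theta\, \|p(A)c\|^2 \right) \\
&> (\cos^2\theta)\, \Psi_{k,\mathbb{R},\mathbb{R}}^2(A) + (\sin^2\theta)\, \Psi_{k,\mathbb{R},\mathbb{R}}^2(A) = \Psi_{k,\mathbb{R},\mathbb{R}}^2(A),
\end{aligned}
\]

where the strict inequality follows from (6.4) and from the fact that $\|p(A)b\|^2$ and
$\|p(A)c\|^2$ do not attain their minima for the same polynomial.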

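To demonstrate the strict inequality $\Psi_{k,\mathbb{R},\mathbb{R}}(A) < \Psi_{k,\mathbb{R},\mathbb{C}}(A)$ numerically we use
the Toh matrix (4.1) with $\varepsilon = 0.1$ and $\omega = 1$, and $k = 3$. Let $b$ and $c$ be the
corresponding two different worst-case initial vectors introduced in Section 4. We
vary $\theta$ from $0$ to $\pi$ and compute the quantities

\[
\min_{p \in \pi_{3,\mathbb{R}}} \left( \cos^2\theta\, \|p(A)b\|^2 + \sin^2\theta\, \|p(A)c\|^2 \right) = \min_{p \in \pi_{3,\mathbb{R}}} \|p(B)g_\theta\|^2, \tag{6.5}
\]

where

\[
B = \begin{bmatrix} A & \\ & A \end{bmatrix}
\qquad \text{and} \qquad
g_\theta = \begin{bmatrix} (\cos\theta)\, b \\ (\sin\theta)\, c \end{bmatrix}.
\]

In Fig. 6.1 we can see clearly that for $\theta \notin \{0, \pi/2, \pi\}$ the value of (6.5) is strictly
larger than $\Psi_3(A) = 0.4579$.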
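The computation of (6.5) over a grid of angles can be sketched as follows. This is a hedged illustration, not the authors' experiment: it assumes NumPy, builds the $\omega = 1$ Toh matrix from the bidiagonal description given in Section 5, and, since the worst-case vectors $b$ and $c$ of Section 4 are not reproduced here, uses random unit vectors as placeholders. The output therefore does not reproduce Fig. 6.1 or the value 0.4579; it only shows the structure of the computation and the elementary bound that the minimum of a sum is at least the sum of the minima.

```python
import numpy as np

def gmres_residual_norm(A, b, k):
    """min over p in pi_k of ||p(A) b||, via a least squares problem
    with the columns A b, ..., A^k b (hypothetical helper)."""
    columns, v = [], b
    for _ in range(k):
        v = A @ v
        columns.append(v)
    K = np.column_stack(columns)
    coeffs, *_ = np.linalg.lstsq(K, b, rcond=None)
    return np.linalg.norm(b - K @ coeffs)

eps, k = 0.1, 3
# Toh matrix for omega = 1: diagonal 1, -1, 1, -1, superdiagonal eps, 1/eps, eps.
A = np.diag([1.0, -1.0, 1.0, -1.0]) + np.diag([eps, 1.0 / eps, eps], k=1)
B = np.block([[A, np.zeros((4, 4))], [np.zeros((4, 4)), A]])

# Placeholders only: these are NOT the worst-case vectors from Section 4.
rng = np.random.default_rng(0)
b = rng.standard_normal(4)
b /= np.linalg.norm(b)
c = rng.standard_normal(4)
c /= np.linalg.norm(c)

m_b = gmres_residual_norm(A, b, k) ** 2
m_c = gmres_residual_norm(A, c, k) ** 2

thetas = np.linspace(0.0, np.pi, 61)
values = np.array([
    gmres_residual_norm(B, np.concatenate([np.cos(t) * b, np.sin(t) * c]), k) ** 2
    for t in thetas
])
# Lower bound cos^2(theta) * min_b + sin^2(theta) * min_c.
lower = np.cos(thetas) ** 2 * m_b + np.sin(thetas) ** 2 * m_c
```

With the actual worst-case vectors in place of the placeholders, `m_b` and `m_c` would both equal $\Psi_3^2(A)$, so `lower` would be constant in $\theta$ and `values` would lie strictly above it away from the three exceptional angles.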

7. Concluding remarks. We have studied the worst-case GMRES approximation
problem, which for each (nonsingular) matrix $A$ and iteration step $k < d(A)$
represents the best possible attainable upper bound on the actual GMRES residual
norm for a linear algebraic system with $A$ at step $k$. We have derived several theoretical
properties of the worst-case GMRES problem, and we have studied its relation
to the ideal GMRES approximation problem.

Fig. 6.1. The GMRES residual norm for a varying complex right hand side.

In this paper we did not consider quantitative estimation of the worst-case GMRES
value $\Psi_k(A)$, and we did not study how this value depends on properties of $A$.
This is an important problem of great practical interest, which is largely open. For
more details and a survey of the current state-of-the-art we refer to [HI Section 5.7].